1 Enhancing the exploitation of data mining in relational 100%
database systems via the rough sets theory including precision
variables

Fernando Machuca , Marta Millan 

Proceedings of the 1998 ACM symposium on Applied Computing 
February 1998 

2 Database mining challenges for digital libraries 100% 
Robert Grossman

ACM Computing Surveys (CSUR) December 1996 

3 Using domain knowledge in knowledge discovery 99% 
Suk-Chung Yoon , Lawrence J. Henschen , E. K. Park , Sam Makki 
Proceedings of the eighth international conference on Information 

and knowledge management November 1999 

With the explosive growth of the size of databases, many 
knowledge discovery applications deal with large quantities of 
data. There is an urgent need to develop methodologies which 
will allow the applications to focus search to a potentially
interesting and relevant portion of the data, which can reduce
the computational complexity of the knowledge discovery
process and improve the interestingness of discovered 
knowledge. Previous work on semantic query optimization, 
which is an approach to ... 



4 Hypertext databases and data mining 99% 
Soumen Chakrabarti

ACM SIGMOD Record , Proceedings of the 1999 ACM SIGMOD 
international conference on Management of data June 1999 
Volume 28 Issue 2 

The volume of unstructured text and hypertext data far exceeds 
that of structured data. Text and hypertext are used for digital 
libraries, product catalogs, reviews, newsgroups, medical 
reports, customer service reports, and the like. Currently 
measured in billions of dollars, the worldwide internet activity is 
expected to reach a trillion dollars by 2002. Database 
researchers have kept some cautious distance from this action. 
The goal of this tutorial is to expose database researchers to t ... 



5 An overview of data warehousing and OLAP technology 99% 

Surajit Chaudhuri , Umeshwar Dayal
ACM SIGMOD Record March 1997 
Volume 26 Issue 1 

Data warehousing and on-line analytical processing (OLAP) are 
essential elements of decision support, which has increasingly 
become a focus of the database industry. Many commercial 
products and services are now available, and all of the principal 
database management system vendors now have offerings in
these areas. Decision support places some rather different 
requirements on database technology compared to traditional 
on-line transaction processing applications. This paper provides 
an overview ... 



6 Automatically extracting structure and data from business 99% 
reports

Stephen W. Liddle , Douglas M. Campbell , Chad Crawford 
Proceedings of the eighth international conference on Information 
and knowledge management November 1999 

A considerable amount of clean semistructured data is internally 
available to companies in the form of business reports. However, 
business reports are untapped for data mining, data 
warehousing, and querying because they are not in relational 
form. Business reports have a regular structure that can be
reconstructed. We present algorithms that automatically infer
the regular structure underlying business reports and 
automatically generate wrappers to extract relational data. 

7 Scalable algorithms for mining large databases 99% 
Rajeev Rastogi , Kyuseok Shim

Tutorial notes of the fifth ACM SIGKDD international conference on 
Knowledge discovery and data mining August 1999 

8 Multilingual “Worldtrek” for authoring and 99% 
comprehension

Marie-Luce Picard , Eric Boudaillier 

Proceedings of the 4th international conference on Intelligent user 
interfaces December 1998 

9 Database research at Columbia University 99% 
Shih-Fu Chang , Luis Gravano , Gail E. Kaiser , Kenneth A. Ross ,

Salvatore J. Stolfo 

ACM SIGMOD Record September 1998 
Volume 27 Issue 3 



10 The IBM data warehouse architecture 99% 
Charles Bontempo , George Zagelow

Communications of the ACM September 1998 

Volume 41 Issue 9 



11 Classification and regression: money *can* grow on trees 98% 

Johannes Gehrke , Wei-Yin Loh , Raghu Ramakrishnan

Tutorial notes of the fifth ACM SIGKDD international conference on 

Knowledge discovery and data mining August 1999 

With over 800 million pages covering most areas of human 
endeavor, the World-wide Web is a fertile ground for data 
mining research to make a difference to the effectiveness of 
information search. Today, Web surfers access the Web through 
two dominant interfaces: clicking on hyperlinks and searching via
keyword queries. This process is often tentative and
unsatisfactory. Better support is needed for expressing one's
information need and dealing with a search result in more 
structured ways than ... 
Karuna P. Joshi , Anupam Joshi , Yelena Yesha , Raghu
Krishnapuram 

Proceedings of the second international workshop on Web
information and data management November 1999 

Analyzing Web Logs for usage and access trends can not only 
provide important information to web site developers and 
administrators, but also help in creating adaptive web sites. 
While there are many existing tools that generate fixed reports 
from web logs, they typically do not allow ad-hoc analysis 
queries. Moreover, such tools cannot discover hidden patterns of 
access embedded in the access logs. We describe a relational 
OLAP (ROLAP) approach for creating a web-log warehouse. This
is pop ... 



13 Supporting storage and retrieval of computer and human 97% 

activity

Mark D. Spiteri , John Bates 

Proceedings of the 8th ACM SIGOPS European workshop on Support 
for composing distributed applications September 1998 

14 Enhanced hypertext categorization using hyperlinks 97% 
Soumen Chakrabarti , Byron Dom , Piotr Indyk

ACM SIGMOD Record , Proceedings of the 1998 ACM SIGMOD 
international conference on Management of data June 1998 
Volume 27 Issue 2 

A major challenge in indexing unstructured hypertext databases 
is to automatically extract meta-data that enables structured 
search using topic taxonomies, circumvents keyword ambiguity, 
and improves the quality of search and profile-based routing and 
filtering. Therefore, an accurate classifier is an essential 
component of a hypertext database. Hyperlinks pose new 
problems not addressed in the extensive text classification 
literature. Links clearly contain high-quality semantic clues that 



15 Implementing catalog clearinghouses with XML and XSL 96% 
Andrew V. Royappa

Proceedings of the 1999 ACM symposium on Applied computing 
February 1999 

16 Monitoring a newsfeed for hot topics 95% 
Mark Shewhart , Mark Wasson

Proceedings of the fifth ACM SIGKDD international conference on 
Knowledge discovery and data mining August 1999 
17 Evolution of a user interface design: NCR's management 95%
discovery tool (MDT)

James F. Knutson , Tej Anand , Richard L. Henneman 
Proceedings of the SIGCHI conference on Human factors in
computing systems March 1997



18 Mind your vocabulary: query mapping across heterogeneous 95%
information sources

Chen-Chuan K. Chang , Hector García-Molina
ACM SIGMOD Record , Proceedings of the 1999 ACM SIGMOD
international conference on Management of data June 1999 
Volume 28 Issue 2 

In this paper we present a mechanism for translating constraint 
queries, i.e., Boolean expressions of constraints, across 
heterogeneous information sources. Integrating such systems is 
difficult in part because they use a wide range of constraints as 
the vocabulary for formulating queries. We describe algorithms 
that apply user-provided mapping rules to translate query 
constraints into ones that are understood and supported in 
another context, e.g. ...



19 CKOS and knowledge management: exploring opportunities for 95% 
using information exchange protocols
Richard T. Herschel , Hamid R. Nemati 

Proceedings of the 1999 ACM SIGCPR conference on Computer 
personnel research April 1999 